ready signal for cluster announce and publish #175
Conversation
Why?
I've made my point w.r.t. service startup ordering and dependencies in the previous PR. What's yours? What problem is it going to resolve?
Because with Unicorn we don't really need to reannounce, but the timer scheme forces us to. Besides, which timer interval is best? 1s? 60s? I think it's better to announce right after context bootstrap.
This PR does not, in fact, add any dependencies or ordering. It just adds the ability to be notified when context bootstrap completes, which is the best time to announce services.
Are you trying to say that once written, the announcement information will always stay in ZK? This is not true.
Also, this IS ordering, specifically one form of it called checkpointing.
What would happen with the information in ZooKeeper?
And what, in your view, is really harmful about this kind of checkpointing? It does not introduce new dependencies, as cluster_t already depends on the locator. OK, another option is to pass an instance of the locator to the api::cluster_t constructor, as it is really required for cluster_t's functionality. But IMHO a signal is a more general approach.
You don't need a signal to make it work. You have to use a timer to make it fault-tolerant.
Fault tolerance is guaranteed by a ZooKeeper watch. If the announce information disappears or the connection is reset, we reannounce. We don't need to do it with a timer.
From my point of view, a timer is more expensive and looks more like a crutch than a signal, or even than passing the locator to cluster_t's constructor.
Anything. It could be lost, corrupted, deleted, or never actually written there in the first place. You cannot rely on that for sure.
ZooKeeper makes certain guarantees. If we cannot rely on those guarantees, we cannot use ZooKeeper at all. It guarantees that if I issue a write command and it succeeds, the data is there. It also guarantees that if anything happens to the connection OR to the data in that node, we will get a watch trigger.
And there are no bugs in ZK, yes. And the network is completely reliable, and disks are 100% redundant. And there are butterflies shitting rainbows in the sky instead of a snow blizzard at the end of March.
That's not about reliability. The point is: if a problem occurs with ZK, a timer will not help in that situation either (there is a problem with ZK, yes). If there is no problem with ZK, the situation is handled properly without a timer.
So, first of all, I don't like signals in general. They were used here to solve one thing, and apparently the amount of mess they introduce is much worse than their benefits. There's a task in Trello to replace them for the next release: https://trello.com/c/5kOeRUxo. The mess I'm talking about is mainly the synchronization requirements signals impose on the code. For example, in your Unicorn cluster implementation you have a data race in https:/cocaine/cocaine-plugins/pull/57/files#diff-8305a92f02ba42856c2a1ea2414d4323R242, and probably in other parts because of that. I doubt you have ever seen it happen, because it's in a pessimistic code path, but still.
So I'm not going to accept it, again. |
Just to clarify: I believed the previous PR was not accepted only because of the strict service start order, not because of the ready signal.
Summarizing.
It doesn't follow from the link you've posted that dropped watch notification packets will be resent. Note that I'm not talking about connection loss. Have you done tests against such cases using
"...There is one case where a watch may be missed: a watch for the existence of a znode not yet created will be missed if the znode is created and deleted while disconnected..." implies that in our case all watches will be delivered (or at least a connection-loss watch will be triggered on the client, which is also OK). I will run several more tests tomorrow to check whether this is true.
There's no implicative relationship between these two statements, only assumptions. Timers are not the best solution, obviously, you're right; but disregarding possible ZK failures is not a good thing to do either, and these failures DO happen. Whether or not the ZK client can work around them is something you can find out via testing. I'm not saying that ZK watches should be removed: obviously they are a very good thing for us, eliminating unnecessary polling and excessive traffic. However, the argument about zero-time notifications does not have much ground, since we've never seen a use case where this became an issue.

Actually, I see a potential regression here: for example, network flapping would cause ephemeral ZK nodes to appear and disappear (if you use them, I don't know), which in turn would cause flapping in the metalocator's internal cluster representation and flood the kernel with IPVS requests.

Anyway, I can accept this PR if you rework it so as to avoid the possible data races emerging from the use of synchronous signals, e.g. by posting an action from the signal handler, and if you can actually accumulate enough data to prove that general ZK weirdness won't affect metalocator consistency (which is probably for @oktocat to decide).

P.S. ZK is based on ZAB, not Raft.
BTW, the easiest way to make signals async-safe would be to boost::asio::io_service::wrap() the handlers. At least, that's what I'm going to do as the first step to resolve this signal synchronization mess. |
Can you please explain which data race you are talking about in the Unicorn case? As I see it, we connect the slot from the unicorn_cluster constructor and invoke the signal later, when construction is finished. As for concurrently connecting a slot and invoking the signal, I think it's enough to move the call and the connect under a mutex, isn't it? https:/cocaine/cocaine-core/blob/v0.12/include/cocaine/context/signal.hpp#L69 . In general it will not affect performance, as they rarely overlap.
JFYI:
I've tested via this script. After disabling the firewall, reannounce occurs. Running this script in a flapping cycle also leads to a reannounce once the firewall is disabled.
See, that's exactly what I'm talking about w.r.t. the signal concurrency mess, and why adding a new signal is not the best idea.
I still cannot understand. We have multiple chambers (execution units), and all async handlers CAN run simultaneously in different threads, so ANY access to asio objects in particular, and to other thread-unsafe objects in general, must be synchronized or guaranteed to execute in one thread. This is not related to signals or anything else: if code can run at the same time from different threads, it must be thread-safe.
For asio chambers (engine & service), we use a share-nothing architecture. The only thing engine chambers do share is service dispatches, which are synchronized. But signals introduce an additional sharing point between threads: signal handlers are invoked from the context thread, yet they access objects owned by the chambers' threads. So, for signal handlers, either additional synchronization or asynchronous dispatch must be used to avoid data races.

Let me explain in greater detail: given a running service, it accepts new connections in the service chamber and then dispatches them to one of the engine chambers. From that point, the connection is owned by the engine chamber and shares nothing with anything else in the process, except its service dispatch, which is already synchronized in the message dispatch path and should be synchronized in the actual service by the service author. If you connect to a signal from such a service, the signal handler will be invoked from a completely unrelated thread, and at the same time it will access previously unshared objects from the service or engine chamber (in your case the service chamber, where you do timers and other asio stuff for the cluster manager). These objects are not synchronized; hence the race.
I'm working on asynchronous signal implementation right now which will solve this problem once and for all, but RIGHT NOW, it's not safe to simply connect signals and use them. |
I think it's a better approach for the locator's first announce. It can also improve startup speed, and it is needed by the upcoming Unicorn-based discovery.